Insights Into Deep Non-Linear Filters for Improved Multi-Channel Speech Enhancement
Authors
Abstract
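The key advantage of using multiple microphones for speech enhancement is that spatial filtering can be used to complement the tempo-spectral processing. In a traditional setting, linear spatial filtering (beamforming) and single-channel post-filtering are commonly performed separately. In contrast, there is a trend towards employing deep neural networks (DNNs) to learn a joint spatial and tempo-spectral non-linear filter, which means that the restriction of a linear processing model and that of a separate processing of spatial and tempo-spectral information can potentially be overcome. However, the internal mechanisms that lead to good performance of such data-driven filters for multi-channel speech enhancement are not well understood. Therefore, in this work, we analyse the properties of a non-linear spatial filter realized by a DNN as well as its interdependency with temporal and spectral processing by carefully controlling the information sources (spatial, spectral, and temporal) available to the network. We confirm the superiority of a non-linear spatial processing model, which outperforms an oracle linear spatial filter in a challenging speaker extraction scenario for a low number of microphones by 0.24 POLQA score. Our analyses reveal that in particular spectral information should be processed jointly with spatial information, as this increases the spatial selectivity of the filter. Our systematic evaluation then leads to a simple network architecture that outperforms state-of-the-art network architectures on a speaker extraction task by 0.22 POLQA score and by 0.32 POLQA score on the CHiME3 data.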
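To make the "traditional setting" in the abstract concrete, the following is a minimal sketch of a linear spatial filter (a delay-and-sum beamformer) followed by a separate single-channel post-filter (a Wiener-like gain), applied to one frequency bin. It is not the method analysed in the paper. The uniform linear array geometry, the microphone spacing, the frequency, the speed of sound, and all function names are assumptions chosen only for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_vector(num_mics, spacing_m, freq_hz, angle_rad):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delay_per_mic = spacing_m * math.sin(angle_rad) / SPEED_OF_SOUND
    return [
        cmath.exp(-2j * math.pi * freq_hz * m * delay_per_mic)
        for m in range(num_mics)
    ]


def delay_and_sum(mic_bins, steer):
    """Linear spatial filter y = w^H x with weights w = d / M."""
    num_mics = len(steer)
    return sum(d.conjugate() * x for d, x in zip(steer, mic_bins)) / num_mics


def wiener_gain(snr):
    """Single-channel post-filter gain G = snr / (1 + snr), snr in linear scale."""
    return snr / (1.0 + snr)


# Hypothetical setup: 4 microphones, 5 cm apart, one bin at 1 kHz,
# beamformer steered to broadside (0 degrees).
steer = steering_vector(4, 0.05, 1000.0, 0.0)

# Unit-amplitude plane waves from the target direction and from 60 degrees.
target = steering_vector(4, 0.05, 1000.0, 0.0)
interferer = steering_vector(4, 0.05, 1000.0, math.radians(60.0))

target_gain = abs(delay_and_sum(target, steer))
interferer_gain = abs(delay_and_sum(interferer, steer))

# Stage two: the post-filter scales the beamformer output per bin,
# independently of the spatial stage.
enhanced = wiener_gain(1.0) * delay_and_sum(target, steer)
```

In this sketch the spatial stage is linear in the microphone signals and the spectral gain is applied afterwards without access to spatial information. These are the two restrictions that, according to the abstract, a jointly learned non-linear filter can potentially overcome.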
Similar resources
Deep Neural Network Approach for Single Channel Speech Enhancement Processing
Multi-channel psychoacoustically motivated speech enhancement
Multichannel techniques offer advantages in noise reduction and overall output signal quality when compared to the well studied mono approaches. In this paper we present an original multichannel psychoacoustically motivated noise reduction algorithm that naturally extends the single channel psychoacoustic masking filter previously studied in the literature [1]. The optimality criterion is desig...
Multi-Modal Hybrid Deep Neural Network for Speech Enhancement
Deep Neural Networks (DNN) have been successful in enhancing noisy speech signals. Enhancement is achieved by learning a nonlinear mapping function from the features of the corrupted speech signal to that of the reference clean speech signal. The quality of predicted features can be improved by providing additional side channel information that is robust to noise, such as visual cues. In this p...
Utilizing Kernel Adaptive Filters for Speech Enhancement within the ALE Framework
Performance of the linear models, widely used within the framework of adaptive line enhancement (ALE), deteriorates dramatically in the presence of non-Gaussian noises. On the other hand, adaptive implementation of nonlinear models, e.g. the Volterra filters, suffers from the severe problems of large number of parameters and slow convergence. Nonetheless, kernel methods are emerging solutions t...
Improved Multi-band Spectral Subtraction Method for Speech Enhancement
In this paper, we propose a new approach to improve the performance of speech enhancement technique based on multi-band spectral subtraction for white Gaussian noise. First, the original power spectral subtraction and multiband spectral subtraction methods are surveyed and implemented. Next, the generalization is applied on multiband spectral subtraction. Finally, the flattened noise spectrum w...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2023
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2022.3221046